 change detection


Segment Any Change

Neural Information Processing Systems

Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem. In this paper, we propose the segment any change models (AnyChange), a new type of change detection model that supports zero-shot prediction and generalization on unseen change types and data distributions. AnyChange is built on the segment anything model (SAM) via our training-free adaptation method, bitemporal latent matching. By revealing and exploiting intra-image and inter-image semantic similarities in SAM's latent space, bitemporal latent matching endows SAM with zero-shot change detection capabilities in a training-free way. We also propose a point query mechanism to enable AnyChange's zero-shot object-centric change detection capability. We perform extensive experiments to confirm the effectiveness of AnyChange for zero-shot change detection. AnyChange sets a new record on the SECOND benchmark for unsupervised change detection, exceeding the previous SOTA by up to 4.4\% F$_1$ score, and achieving comparable accuracy with negligible manual annotations (1 pixel per image) for supervised change detection.
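As a rough illustration of the latent matching idea in the abstract above, the sketch below compares one embedding per region across two acquisition dates by cosine similarity and flags regions whose similarity is low. It is not the AnyChange implementation: the function names, the toy two-dimensional embeddings, and the 0.5 threshold are all hypothetical, and in practice the embeddings would come from SAM's image encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar
    return dot / (norm_u * norm_v)


def flag_changed_regions(emb_t1, emb_t2, threshold=0.5):
    """Return indices of regions whose embeddings disagree across dates.

    Assumption: emb_t1[i] and emb_t2[i] describe the same image region
    at the two dates. The threshold value is hypothetical.
    """
    return [
        i
        for i, (u, v) in enumerate(zip(emb_t1, emb_t2))
        if cosine_similarity(u, v) < threshold
    ]


# Toy embeddings for three regions (made up for illustration).
emb_t1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
emb_t2 = [[1.0, 0.1], [1.0, 0.0], [1.0, 1.0]]
changed = flag_changed_regions(emb_t1, emb_t2)
```

Only the second region is flagged here, because its two embeddings are orthogonal while the other two pairs are nearly or exactly parallel.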


SAM Guided Semantic and Motion Changed Region Mining for Remote Sensing Change Captioning

Wang, Futian, Wang, Mengqi, Wang, Xiao, Wang, Haowen, Tang, Jin

arXiv.org Artificial Intelligence

Remote sensing change captioning is an emerging and popular research task that aims to describe, in natural language, the content of interest that has changed between two remote sensing images captured at different times. Existing methods typically employ CNNs/Transformers to extract visual representations from the given images or incorporate auxiliary tasks to enhance the final results, with weak region awareness and limited temporal alignment. To address these issues, this paper explores the use of the SAM (Segment Anything Model) foundation model to extract region-level representations and inject region-of-interest knowledge into the captioning framework. Specifically, we employ a CNN/Transformer model to extract global-level vision features, leverage the SAM foundation model to delineate semantic- and motion-level change regions, and utilize a specially constructed knowledge graph to provide information about objects of interest. These heterogeneous sources of information are then fused via cross-attention, and a Transformer decoder is used to generate the final natural language description of the observed changes. Extensive experimental results demonstrate that our method achieves state-of-the-art performance across multiple widely used benchmark datasets. The source code of this paper will be released on https://github.com/Event-AHU/SAM_ChangeCaptioning



3D Semantic Understanding from Monocular Remote Sensing Imagery

Neural Information Processing Systems

Section A.1 outlines the generation process of the SynRS3D dataset, including the tools and It also covers the licenses for these plugins. Section A.4 describes the experimental setup and the selection of hyperparameters for the RS3DAda method. Section A.5 presents the ablation study results and analysis for the RS3DAda method. Section A.6 provides supplementary experimental The generation workflow of SynRS3D involves several key steps, from initializing sensor and sunlight parameters to generating the layout, geometry, and textures of the scene. Initialization: Set up the sensor and sunlight parameters using uniform and normal distributions to simulate various conditions.



RoS-Guard: Robust and Scalable Online Change Detection with Delay-Optimal Guarantees

Zhu, Zelin, Huang, Yancheng, Yang, Kai

arXiv.org Artificial Intelligence

Online change detection (OCD) aims to rapidly identify change points in streaming data and is critical in applications such as power system monitoring, wireless network sensing, and financial anomaly detection. Existing OCD methods typically assume precise system knowledge, which is unrealistic due to estimation errors and environmental variations. Moreover, existing OCD methods often struggle with efficiency in large-scale systems. To overcome these challenges, we propose RoS-Guard, a robust and optimal OCD algorithm tailored for linear systems with uncertainty. Through a tight relaxation and reformulation of the OCD optimization problem, RoS-Guard employs neural unrolling to enable efficient parallel computation via GPU acceleration. The algorithm provides theoretical guarantees on performance, including expected false alarm rate and worst-case average detection delay. Extensive experiments validate the effectiveness of RoS-Guard and demonstrate significant computational speedup in large-scale system scenarios.
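For readers new to online change detection, the sketch below shows the classical one-sided CUSUM detector for an upward mean shift. It is a textbook baseline, not RoS-Guard, and it assumes the pre-change mean is known exactly, which is the kind of assumption the abstract above calls unrealistic. The function name, the drift allowance, and the threshold are hypothetical choices for illustration.

```python
def cusum_first_alarm(stream, mean_before, drift, threshold):
    """Return the index of the first CUSUM alarm, or None if none fires.

    The statistic follows S_t = max(0, S_{t-1} + x_t - mean_before - drift)
    and an alarm is raised once S_t exceeds the threshold.
    """
    stat = 0.0
    for t, x in enumerate(stream):
        stat = max(0.0, stat + (x - mean_before - drift))
        if stat > threshold:
            return t
    return None


# Toy stream: the mean jumps from 0 to 2 at index 5.
stream = [0.0] * 5 + [2.0] * 5
alarm = cusum_first_alarm(stream, mean_before=0.0, drift=0.5, threshold=4.0)
```

Raising the threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off that the guarantees mentioned in the abstract concern.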



Bandit Quickest Changepoint Detection

Neural Information Processing Systems

Surveillance systems [HC11] are equipped with a suite of sensors that can be switched and steered to focus attention on any target or location over a physical landscape (see Figure 1) to detect abrupt changes at any location. On the other hand, sensor suites are resource limited, and only a limited subset, among all the locations, can be probed at any time.

